A taste of APHRODITE: an Architecture for False Positive Reduction

Authors

  • Damiano Bolzoni
  • Sandro Etalle
Abstract

This work addresses the problems of automated response systems in Wireless Mobile Ad Hoc Networks (MANETs). The setting is challenging because the topology of these networks is dynamic, and the fully distributed cooperation among nodes not only raises the false positive rate of intrusion alarms but also complicates response analysis. We present a cooperative, distributed and automated response model that addresses these challenges by establishing a secure communication channel and coordinating intrusion response among response agents.

The first step of our automated response system (ARS) is alarm validation, which is crucial in response systems because it can compensate for the false positives of intrusion detection systems and catch forged alarms. We investigate alarm validation in two aspects: local and global. For a single, non-correlated attacker, the response system can use a majority-voting strategy among response agents to validate alarms locally. Furthermore, alarms can be validated globally using Public Key Infrastructure (PKI) techniques.

To perform alarm validation, response agents that hold related alarms against the same node need to exchange those alarms with each other. How to perform this exchange efficiently, without routing through the suspicious node, is a critical issue. Currently, we focus on developing a temporary coordinator that aggregates alarms from all response agents monitoring the same node. Through this efficient alarm exchange mechanism, response agents can reduce the false positive rate, broadcast global alarms with low message overhead, and prevent false accusation locally and globally. In future work, response agents will be able to use this exchange framework to correlate alarms and cooperatively determine the best intrusion response among a feasible set of responses.

Using the Dempster-Shafer theory for Traffic Classification

Giovanni B. Barone, Claudio Mazzariello, Carlo Sansone
Centro di Ateneo per i Servizi Informativi
Dipartimento di Informatica e Sistemistica
Università degli Studi di Napoli Federico II
{gbbarone,cmazzari,carlosan}@unina.it

When addressing the problem of network intrusion detection, it is often necessary to have a database of packets captured from a real network, whose actual class (i.e., normal or attack) is known. As an example, supervised pattern recognition techniques might be useful for discovering novel attacks; such techniques, however, might need some labeled traffic, containing both normal and attack packets, in order to learn traffic models for correctly detecting anomalous behaviors. A labeled database can also be used for comparing the detection performance of different Intrusion Detection Systems (IDS). To build suitable labeled datasets, one would need to know the network environment and the traffic characteristics perfectly in advance. Furthermore, a manual labeling phase is typically required, which is very time-consuming and tedious.

In this work we propose an architecture for automatically building up a traffic database, made up of packets each labeled either as normal or as an attack. Moreover, we describe a general approach for building an IDS that uses such a labeled database to train itself. As we propose to use multiple detection techniques, it is necessary to employ some fusion strategy in order to combine their outputs. As we have no prior knowledge of the traffic characteristics, we use the Dempster and Shafer [1] combination rule, which completely disregards any prior knowledge about the data to be classified. The Dempster-Shafer combination rule is, in fact, computed regardless of the analyzed data and the classifiers employed. Though some works already propose to exploit the results of such a theory for intrusion detection (see for example [2]), we also propose an architecture able to classify unlabelled datasets.
As said before, possible use cases are the automated construction of a labelled dataset for training learning algorithms and the employment in a real network scenario.

The proposed architecture consists of two types of Intrusion Detection Modules, respectively Master IDS (M-IDS) and Slave IDS (S-IDS). An M-IDS must not require training on labeled data. Typical examples of M-IDS are signature-based IDS, such as Snort, or IDS based on unsupervised techniques. A bank of M-IDS starts analyzing offline the packets contained in a dumped traffic database. No prior knowledge is needed about such traffic. According to the theory of Dempster and Shafer (D-S theory) [1], a Basic Probability Assignment (bpa) can be associated with each M-IDS. Each bpa describes the subjective degree of confidence attributed to each different classification performed by the considered IDS, by means of prior hypotheses or observations. Starting from the category an M-IDS belongs to (i.e., signature-based, unsupervised, etc.), we defined some criteria for suitably assigning bpa's to M-IDS.

The D-S theory has been frequently applied to deal with uncertainty management and incomplete reasoning. It is worth noting that it differs from the classical Bayesian theory: for Bayes the sum of P(A) and P(¬A) always equals one, while this is not necessarily true according to the D-S theory. Moreover, the D-S theory can explicitly model the absence of information, while in the absence of information a Bayesian approach attributes the same probability to all the possible events. These characteristics can be very attractive for managing the uncertainties of our domain, due, for example, to the presence of the so-called zero-day attacks. Thus, each decision from each M-IDS will be supported by the bpa associated with it, which expresses the degree of belief in the classifier decision.
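
To make the fusion step concrete, the following is a minimal sketch of Dempster's rule of combination over the two-class frame {normal, attack}. The mass values, the names `snort_bpa` and `neural_bpa`, and the helper functions are illustrative assumptions; they are not the authors' bpa assignment criteria or their implementation.

```python
from itertools import product

# Frame of discernment: every packet is either normal or an attack.
NORMAL = frozenset({"normal"})
ATTACK = frozenset({"attack"})
THETA = NORMAL | ATTACK  # mass placed here models "no information"


def combine(m1, m2):
    """Fuse two bpa's (dict: focal set -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            # The two sources support incompatible hypotheses.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


def belief(m, hypothesis):
    """Bel(hypothesis): total mass of focal sets contained in the hypothesis."""
    return sum(mass for focal, mass in m.items() if focal <= hypothesis)


# Hypothetical bpa's for one packet (illustrative numbers only).
snort_bpa = {ATTACK: 0.7, THETA: 0.3}
neural_bpa = {ATTACK: 0.5, NORMAL: 0.2, THETA: 0.3}

fused = combine(snort_bpa, neural_bpa)
print({"/".join(sorted(k)): round(v, 4) for k, v in fused.items()})
print("Bel(attack) + Bel(normal) =", round(belief(fused, ATTACK) + belief(fused, NORMAL), 4))
```

In this sketch the mass left on the whole frame stays unassigned to either class, so Bel(attack) and Bel(normal) sum to less than one, which is the difference from the Bayesian setting noted above.
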
Such bpa's will be combined by using the Dempster-Shafer combination rule [1], in order to obtain a bpa for each possible class the packet can belong to. We also defined a criterion for obtaining a crisp classification of each packet from the overall bpa, by means of a properly defined index. After a suitable number of packets has been classified with a satisfactory degree of confidence, such packets can be fed to each of the S-IDS. S-IDS are based on supervised learning techniques; they will use the portion of reliably labeled traffic as a training set. Once their training is over, they will contribute to a further classification of packets according to what they have learned so far. Once again, the D-S rule can be used for suitably combining the decisions of the banks of M-IDS and S-IDS. It is also possible to employ the whole system as an online IDS. The well-known ability of supervised systems to generalize known attack patterns for detecting novel attacks might in fact help in detecting attacks that were not known when the statically configured M-IDS were set up.

We evaluated the capability of the proposed architecture to correctly label a traffic database. In order to benchmark the system performance, we used the Lincoln Lab network traffic database. Though such a database has been heavily criticized in the past years, it is still referred to as a benchmark for data mining and pattern recognition techniques in intrusion detection. In particular, we evaluated the performance, in terms of false alarm rate and missed detections, obtained by combining two M-IDS (Snort and an unsupervised neural network) according to the proposed architecture and the D-S theory. We observed that the proposed system can improve on the performance of each base classifier alone in correctly labelling traffic data. Furthermore, we are interested in demonstrating that an S-IDS can be reliably trained by using traffic data labeled by the bank of M-IDS.
In order to prove it, we compared the performance obtained by the chosen S-IDS (a rule-based one based on Slipper [3]) when trained with data labeled by our architecture with the performance obtained when it is trained on data whose real labels are known in advance. The difference in terms of error rate proved acceptable (only 0.07% with a recognition rate of 99.9%), thus confirming our claims. Finally, we were also able to observe an interesting tradeoff between the attainable performance and the size of the training set used. Different training set sizes can be obtained by setting a suitable threshold on the index that estimates the accuracy of the packet labels provided by the bank of M-IDS.

The tests described above have been carried out using the hardware equipment of the University Federico II server farm. Further analysis on real data will be run on traffic sniffed at this server farm, in order to make the results significant for deployment in a real enterprise scenario.

References
[1] J. Gordon, E.H. Shortliffe, “The Dempster-Shafer Theory of Evidence”, in B.G. Buchanan and E.H. Shortliffe (Eds.), Rule-Based Expert Systems, Addison-Wesley, pp. 272-292, 1984.
[2] T.M. Chen, V. Venkataramanan, “Dempster-Shafer Theory for Intrusion Detection in Ad Hoc Networks”, IEEE Internet Computing, pp. 35-41, November-December 2005.
[3] W.W. Cohen, Y. Singer, “A Simple, Fast, and Effective Rule Learner”, in Proc. of the 16th National Conference on Artif. Intell. and 11th Conference on Innovative Applications of Artif. Intell., July 18-22, Orlando, Florida, USA, pp. 335-342, 1999.

Simulation Environment for Investigation of Cooperative Distributed Attacks and Defense

Igor Kotenko, Alexander Ulanov
Computer Security Research Group, Saint-Petersburg Institute for Informatics and Automation (SPIIRAS)
{ivkote, ulanov}@iias.sbp.su

Fig. 1. User interface of simulation environment

Nowadays we are witnessing an increasing number of distributed attacks on global computer networks. Many of them aim at distributed denial of service (DDoS) against critical information resources. These attacks are carried out through the joint efforts of many malicious software components deployed on compromised Internet hosts. The general approach to DDoS defense includes mechanisms of attack prevention, detection, tracing of the malicious traffic sources, and attack counteraction. Because of the gravity and complexity of DDoS, the design of an effective defense is a complicated scientific and technical problem. It is rather hard to examine and evaluate the effectiveness and efficiency of defense mechanisms in practice. However, these mechanisms can be simulated with the necessary fidelity and thoroughly analyzed.

We propose an approach and a software environment intended for the simulation and investigation of distributed cooperative attacks and defense systems. The main attention is given to the integrated agent-oriented and packet-level approach to the simulation of security processes in the Internet. It can provide acceptable fidelity and scalability in implementing computer attacks and defenses. Special attention is given to cooperative distributed defense mechanisms that are based on the deployment of defense components in various Internet subnets. This is intended for simulating the interactions of the security elements of various ISPs.

The cybernetic counteraction is represented as the interaction of teams of malefactors and teams of security agents. The agent teams can oppose each other or cooperate.
Attack agents are subdivided into at least two classes: “daemons” and “masters”. To simulate distributed cooperative defense, the security agents belong to the following classes: information processing (“samplers”); attack detection (“detectors”); filtering and balancing (“filters”); traceback and investigation (“investigators”).

The proposed simulation approach presumes the following components of the simulation environment: (1) Discrete-event Simulation Framework (implemented on OMNeT++), (2) Internet Simulation Framework (using the OMNeT++ INET Framework), (3) Attack and Defense Framework (library of attacks and defenses), (4) Multi-Agent Simulation Framework. The Attack and Defense Framework includes attack and defense modules and the modules that extend the hosts of the INET Framework: filter table and packet analyzer.

The main window of the simulation environment (Fig. 1) shows a simulated computer network (hosts and channels). Hosts can provide different functionality depending on the chosen parameters and internal modules. Internal modules are responsible for the functioning of protocols and applications at various levels of the OSI model. Applications (including agents) are installed on hosts. The simulation management window allows simulation parameters to be viewed and changed. Importantly, the events that are valuable for understanding attack and defense can be seen on a time scale. Corresponding windows show the current status of agent teams. It is possible to open windows that characterize the functioning of particular hosts, protocols, agents, and defense methods, to see the contents of packets, etc.

The following parameters are used in the environment to define the attack: victim type; type of attack; attack rate dynamics; impact on a victim; persistence of agent set; possibility of exposure; source address validity; degree of automation.
Defense mechanisms are determined in the environment by the following parameters: deployment location; mechanism of component cooperation; covered defense stages; attack detection technique; attack source detection technique; attack prevention and counteraction technique; model data gathering technique; technique for determining deviation from model data.

The environment allows various classes of attack and defense mechanisms to be analyzed. This abstract and the poster are devoted to the investigation of the models of cooperation between distributed defense teams: (1) filter-level cooperation: the team whose network is under attack can apply filtering rules on the filters of other teams; (2) sampler-level cooperation: the team whose network is under attack can get the traffic information from the samplers of other teams; (3) “poor” cooperation: the teams can get the traffic information from the samplers of some other teams and apply filtering rules on the filters of some other teams (each team knows a subset of other teams depending on the cooperation degree); (4) “full” cooperation: the team whose network is under attack can get the traffic information from all samplers of other teams and apply filtering rules on all filters of other teams. Such cooperation schemas are used in the cooperative DDoS defense methods: COSSACK, Perimeter-based DDoS defense, DefCOM, Gateway-based, ACC pushback, MbSQD, SOS, tIP router architecture, etc. The cooperation schemas can be investigated and compared by analyzing various parameters: (1) incoming traffic before and after filters; (2) normal and attack traffic rates as a share of the whole traffic coming into the defended network; (3) false positive and false negative rates; (4) detection and reaction times, etc.

In future research we are planning to expand the library of attacks and defenses and to elaborate the functionality of particular components.
An important part of future research is running numerous experiments to investigate various attacks, defense mechanisms (attack prevention, detection, tracing the attack sources and counteraction) and optimal defense combinations.

Belief representation of network behavior in user and application context to improve intrusion detection

Pedro Bados
[email protected] S.A., Switzerland
www.nexthink.com

In this work we first explore an innovative framework to represent classical network information based on the contexts of binaries and user accounts. At this point, we have developed a new generation of belief networks which model the usage of these applications by a user or a community of users within the corporate environment. Our research has been initially implemented as the core technology of REFLEX solutions and already validated in several customer deployments.

In our preliminary studies, we have identified that most commercial and research intrusion detection systems focus either on pure network traffic analysis or on host behavior inspection. It is well known that both approaches present particular advantages in accuracy and performance, but at the same time they might have some drawbacks concerning operational aspects. We have also observed that, since both alternatives are complementary and commonly used together, most of these shortcomings are aggregated, often resulting in unmanageable systems.

In the research performed at NEXThink we have designed and developed a new hybrid context where every single network activity is transparently associated with a user account and an application binary. This association is performed with certainty in 100% of cases. It is important to note that most research and market approaches to analyzing network traffic rely on protocol flows to guess applications and on IP addresses to identify the users.
In our framework, applications are the real names of the binaries running at end-points, while the users are the real user accounts logged in to the machine. Moreover, our classification technique has been shown to run silently in real time on networks of thousands of computers.

At this point, our analysis engine stores and builds up a centralized model of network behavior for users and applications. This model corresponds to a belief network of the usage of a given application by a given user account. Finally, an analyzer is able to detect and measure deviations in the models, generating alarms of abnormal usage and providing explanations of such anomalies.

Belief networks [1] are representations of Bayesian relationships between evidence and facts. They are currently used in several domains to reason under uncertainty, such as illness prediction in medicine or incident management. Two important properties make them especially suitable for our objective. Firstly, their expressiveness gives the security administrator a concrete explanation of the true or false positive when receiving an alarm. We have observed that meaningful explanations for abnormal events greatly reduce the reaction time in large organizations. Secondly, belief networks allow the model to be enriched with new evidence and relations without changing or retraining the system.

Bayesian representations can be easily understood and extended to accommodate new types of detections. Since they actually represent the expert knowledge about a certain domain, they can be adapted and enhanced with user interaction in a very intuitive way. This simple link between the detection model and the end-user represents a very attractive field for risk analysis.

The advantages of this new framework for the security field are evident.
Having all network events associated with a certain executable name, version and user account dramatically increases the detection accuracy for virus propagation, port scanning or abuse of privileges. At the same time, we have observed that these parameters, along with the simple explanations, become essential for administrators to quickly discard irrelevant events from the list of abnormal incidents.

In addition, modeling the network behavior of a certain user account sheds some light on a challenging research field such as identity theft detection. We have shown that a user network behavior model is unique in 96% of the cases and therefore sufficient to identify the person behind the computer, while the discrimination of the rest of the users is successfully achieved in 90% of the situations.

[1] Pearl, J. (1988) Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.

A New Freely Available Data Set for the Evaluation of Signature-Based IDS

Frédéric Massicotte
Communications Research Centre
3701 Carling Avenue
P.O. Box 11490, Stn. H
Ottawa, ON, Canada K2H 8S2

François Gagnon
Carleton University
1125 Colonel By Drive
Ottawa, ON, Canada K1S 5B6

An Intrusion Detection System (IDS) is a critical aspect of a network security posture. Although there are many IDS products available today, the data used to evaluate these IDS is either proprietary or obsolete. For instance, even though no new data set has been released by DARPA since 2000, the DARPA traffic traces are still used today by the security community since they represent the only significant and freely available research data. In this poster, we present an automatically generated and publicly available data set that can be used for the evaluation of signature-based IDS.

The current version of our data set was developed to test and evaluate network and signature-based IDS for attack scenario recognition.
It contains traffic traces of attack scenarios derived from the execution of well-known Vulnerability Exploitation Programs (VEP) taken from various sources such as SecurityFocus, Metasploit, Nessus and Operator. The data set is a collection of tcpdump traffic traces, each trace containing one attack scenario. To generate the data set, we used 124 VEP (covering a total of 92 vulnerabilities) and 108 different target systems. Each VEP was launched against vulnerable and non-vulnerable systems (among the 108 target systems) that offer a service on the port targeted by the VEP. Each combination (VEP + configuration + target) corresponds to an attack scenario and produces a traffic trace in the data set. The 124 VEP are distributed among 17 different ports and the data set contains more than 10000 traffic traces. The data set contains a wide variety of attacks such as buffer overflow, information leak, privilege gain and denial of service. In particular, these attacks use different remote access techniques after exploiting the vulnerability, such as direct and reverse shell. Moreover, some attacks in the data set are not detected by the current version of Snort and Bro. In addition to those traffic traces, our data set also contains traces resulting from the application of different IDS evasion techniques, such as Fragroute and Whisker, to the VEP. This part of the data set is used to verify whether IDS are able to detect modified attacks.

One of the main features of our data set is the documentation of each traffic trace. Each traffic trace is documented through four characteristics: the target system configuration (operating system and installed software), the VEP configuration (options used), whether or not the target system has the vulnerability exploited by that VEP (based on the target configuration and the information available on SecurityFocus for the VEP), and the actual success or failure of the attack attempt (based on the VEP output and the effects on the target).
Since the traffic traces contained in the data set are properly documented, the evaluation process can be automated.

To automatically generate such a large-scale, documented and maintainable data set, we developed a controlled virtual network using VMware Workstation 5.0. This controlled virtual network allows us to record network traffic produced during attacks, to control the network (e.g., traffic noise), to control the attack propagation (confinement), to use various heterogeneous target system configurations, and to quickly recover from attacks. It is flexible (it can easily apply IDS evasion techniques to attacks), updatable (it can easily incorporate new target configurations and new attacks) and completely automated (from virtual network setup and attack execution to the documentation of traffic traces). Our controlled environment is not only a database of virtual machines; it is also a layer on top of VMware that allows experiments to be set up and described in a script language. For example, this virtual network infrastructure was also used to provide data to the LEURRE and SCRIPTGEN projects of Eurecom.

For now, the data set presented here can be used to evaluate signature-based IDS to see if they can detect known real attacks (the false negative rate) and if they can distinguish between successful and failed attempts of an attack (false positives on failed attempts). Our data set has proven to be useful in several projects. For example, we found that Snort and Bro were not able to distinguish successful from failed attacks. By periodically updating the attack traces we generate and sharing them with the network security research community, we could provide a recent common reference for evaluating IDS.
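
Because every trace records whether the attack succeeded, the evaluation described above can be scripted. The following is a minimal sketch under stated assumptions: the `Trace` record layout, the `ids_alerted` field (the verdict of the IDS under test), and the choice to measure missed detections on successful attacks only are hypothetical and are not the data set's actual documentation format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """Hypothetical summary of one documented traffic trace."""
    trace_id: str
    target_vulnerable: bool   # documented: target has the exploited vulnerability
    attack_succeeded: bool    # documented: actual outcome of the attempt
    ids_alerted: bool         # assumption: did the IDS under test raise an alert?


def evaluate(traces):
    """Return missed-detection and failed-attempt alert rates for one IDS."""
    succeeded = [t for t in traces if t.attack_succeeded]
    failed = [t for t in traces if not t.attack_succeeded]
    missed = sum(1 for t in succeeded if not t.ids_alerted)
    alerted_on_failed = sum(1 for t in failed if t.ids_alerted)
    return {
        # Successful attacks the IDS did not flag.
        "false_negative_rate": missed / len(succeeded) if succeeded else 0.0,
        # Failed attempts that were still reported as attacks.
        "failed_attempt_alert_rate": alerted_on_failed / len(failed) if failed else 0.0,
    }


# Illustrative records only; these are not taken from the data set.
sample_traces = [
    Trace("t1", target_vulnerable=True, attack_succeeded=True, ids_alerted=True),
    Trace("t2", target_vulnerable=True, attack_succeeded=True, ids_alerted=False),
    Trace("t3", target_vulnerable=False, attack_succeeded=False, ids_alerted=True),
    Trace("t4", target_vulnerable=False, attack_succeeded=False, ids_alerted=True),
]

print(evaluate(sample_traces))
```

An IDS that alerts on every failed attempt, as in the sample records, would show a failed-attempt alert rate of 1.0, which is the kind of behavior the abstract reports for Snort and Bro.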

Publication date: 2006